Internet Archive crawldata from Aaron Swartz Crawl, captured by crawl345.us.archive.org:aaronswartz from Sun Jan 20 14:33:21 PST 2013 to Sun Jan 20 07:02:45 PST 2013.
This item is part of the collection: Away from Keyboard: Aaron H. Swartz
Identifier: AS-20130120143321-crawl345
Contributor: Internet Archive
Crawljob: aaronswartz
Creator: Internet Archive
Date: 2013
Firstfiledate: 20130120143302
Firstfileserial: 02039
Identifier-access: http://www.archive.org/details/AS-20130120143321-crawl345
Lastdate: 20130120070245
Lastfiledate: 20130120145608
Lastfileserial: 02045
Mediatype: web
Numwarcs: 7
Operator: lekash@archive.org
Scandate: 20130120143302
Scanner: crawl345.us.archive.org
Scanningcenter: sanfrancisco
Sizehint: 10568202426
Sponsor: Internet Archive
Publicdate: 2013-01-20 21:41:38
Addeddate: 2013-01-20 21:41:38
Imagecount: 46319
Keywords: crawldata
| Information | Format | Size |
| AS-20130120143321-crawl345_files.xml | Metadata | [file] |
| AS-20130120143321-crawl345_meta.xml | Metadata | 1.4 KB |
| Other Files | Web ARChive GZ | WARC CDX Index | Item CDX Index | Item CDX Meta-Index | Text |
| AS-20130120143321-02039.warc.gz | 1.8 GB | 11.1 KB | | | |
| AS-20130120143321-crawl345.cdx.gz | 2.8 MB | | | | |
| AS-20130120143321-crawl345.cdx.idx | 2.1 KB | | | | |
| AS-20130120143831-02040.warc.gz | 1.3 GB | 902.7 KB | | | |
| AS-20130120144440-02041.warc.gz | 1.0 GB | 1.0 KB | | | |
| AS-20130120144642-02042.warc.gz | 955.6 MB | 654.9 KB | | | |
| AS-20130120144901-02043.warc.gz | 953.7 MB | 932.0 KB | | | |
| AS-20130120145214-02044.warc.gz | 1.0 GB | 517.6 KB | | | |
| AS-20130120145606-02045.warc.gz | 2.8 GB | 64.1 KB | | | |
| MANIFEST.txt | 462.0 B |